ABSTRACT



We explore whether medium-resolution stellar spectra can be reconstructed 
from photometric observations, taking advantage of the highly compressible na- 
ture of the spectra. We formulate the spectral reconstruction as a least-squares 
problem with a sparsity constraint. In our test case using data from the Sloan 
Digital Sky Survey, only three broad-band filters are used as input. We demon- 
strate that reconstruction using three principal components is feasible with these 
filters, leading to differences with respect to the original spectrum smaller than 
5%. We analyze the effect of uncertainties in the observed magnitudes and find 
that the available high photometric precision induces very small errors in the re- 
construction. This process may facilitate the extraction of purely spectroscopic 
quantities, such as the overall metallicity, for hundreds of millions of stars for 
which only photometric information is available, using standard techniques ap- 
plied to the reconstructed spectra. 

Subject headings: methods: statistical, stars — statistics, surveys 



1. Introduction 



There are more than 200 photometric systems that have been used in astronomy (Bessell
2005, and references therein). The amount of information about a star that can be
extracted from photometry is highly dependent on the choice of photometric filters. Early on,
Strömgren introduced the uvby system with the aim of characterizing the main stellar
atmospheric parameters and reddening (Strömgren 1951, 1956). Strömgren's filters have widths
of the order of 200 Å, and are thus considered intermediate-band. Other systems with
similar and narrower passbands have been introduced since, but most photometric systems use
filters significantly broader than Strömgren's, and therefore tend to provide lower sensitivity
to the atmospheric parameters.

Until recently, the most widely used photometric system was the broad-band
Johnson-Cousins UBVRI. With the advent of the Sloan Digital Sky Survey (SDSS), which includes
CCD-based photometry for 357 million unique optical sources over more than 11,000 square
degrees (Abazajian et al. 2009), and 2MASS (Skrutskie et al. 2006), which includes nearly
half a billion near-IR sources over the entire sky, these new systems have taken over. The
hegemony of the SDSS system in the optical is illustrated by the fact that new and future
instruments, such as the Large Synoptic Survey Telescope (LSST)^1, the Dark Energy Survey
(DES) camera^2, or OSIRIS (Cepa et al. 2000) on the Gran Telescopio Canarias (GTC), are
adopting the same system.

Although the widths of the SDSS filters are 3-6 times larger than the Strömgren
passbands, Ivezić et al. (2008) have shown that, if the reddening is known with sufficient accuracy,
it is possible to estimate stellar effective temperatures for late-type stars with a typical
precision of ~100 K, and metallicities with a precision of ~0.2 dex, from SDSS photometry
alone for stars of moderate metallicities. The sensitivity of SDSS photometry to surface
gravity is much weaker, and disentangling this parameter from the other two, even in the
absence of reddening, may not be possible; in any case, we are not aware of any successful
calibration.

Because of the vast number of sources with available SDSS photometry, it is desirable
to ensure that we are extracting all possible information captured by this system. The most
straightforward techniques for mapping photometric indices into the quantities of interest,
such as stellar atmospheric parameters, have had limited success, and the time is ripe
to explore new avenues. This work examines the possibility of using a simple technique
inspired by the concept of compressed sensing (CS; Candès et al. 2006; Donoho 2006) to
reconstruct spectroscopic data of stellar objects from SDSS photometry. If there is an
accessible mapping between photometry and intermediate-dispersion spectra, the available tools
for spectroscopic analysis could then be applied to the reconstructed spectra in order to
recover the parameters of interest.



The recent work on SDSS spectra by McGurk et al. (2010) indicates that for stars in
narrow color bands (about 0.02 mag wide in g - r), principal component analysis (PCA)
can be applied, and most of the variance in the spectra is recovered with just 4 principal
components. This result strongly suggests that the SDSS intermediate-resolution spectra are
highly sparse, and that the most relevant information in the data can be compressed into just four
numbers. This statement should be accompanied by a warning: SDSS spectra typically have
a modest signal-to-noise ratio (most SDSS spectra are for stars in the range 16 < g < 18.6
and have typical signal-to-noise ratios at 500 nm between 65 and 8). We also note that
McGurk et al. (2010) used the median difference between the original and reconstructed
spectra to quantify the level of agreement. Hence, significantly larger differences between
the original and the spectra recovered with four components are expected for a fraction
of their sample, as we show in this paper.

^1 See http://www.lsst.org
^2 See http://www.darkenergysurvey.org

If the spectra are indeed sparse, there is a good chance that photometry alone can
be used to reconstruct them to some precision (see Asensio Ramos & López Ariste 2010;
Asensio Ramos 2010, for similar applications). In this paper, we explore this possibility in
detail, in particular for the case when SDSS stellar spectra can be represented as a linear
combination of a small number of vectors, as concluded by McGurk et al. (2010). Section 2
describes the basic concept and develops the mathematical method. Section 3 applies the
method to SDSS spectra and photometry. Section 4 discusses the error propagation from
the photometry to the reconstructed spectra, and Section 5 summarizes our findings.



2. Sparsity and reconstruction 

During the last few years, the emerging theory of compressed sensing has shown that
the Nyquist-Shannon sampling theorem is too restrictive when some details of the signal
structure are known in advance. The interesting point of the CS paradigm is that, in
many instances, natural signals do have such a known structure: typically, only a few
elements of the basis set in which we expand the signal are necessary
for an accurate description of the important physical information. Instead of measuring the
full signal (the wavelength variation of the stellar spectrum in our case), under the CS framework
one measures a few linear projections of the signal along some vectors known in advance and
reconstructs the signal by solving a non-linear problem.

Explicitly, let $f$ be a vector of length $M$ that represents the sampled wavelength variation
of the stellar spectrum. A standard spectrograph measures the spectrum by accumulating
photons in wavelength bins determined by the spectral resolution. Instead, we propose to
measure scalar products of the signal with carefully selected vectors (multiplex
measurements), so that:

$$y = \Phi f + e, \qquad (1)$$

where $y$ is the vector of measurements of dimension $N$, $\Phi$ is an $N \times M$ sensing matrix that
we particularize below and $e$ is a vector of dimension $N$ that characterizes the noise of the
measurement process. Note that the previous equation describes the most general linear
multiplexing scheme, in which the number of measurements $N$ and the length of the signal
$M$ may differ. In the most standard multiplexing situation, the number of scalar products
measured equals the dimension of the signal ($N = M$) and it is possible to recover the vector
$f$ provided that rank($\Phi$) $= N$ (in other words, that the rows of the $\Phi$ matrix are
linearly independent).

Our aim, though, is to solve the previous linear system (i.e., obtain the spectrum $f$) from
the smallest possible number of measurements $y$. In general, this can be accomplished by
solving the linear system using the singular value decomposition (SVD) of the $\Phi$ matrix (see,
e.g., Press et al. 1986). The SVD solution is the one that produces
the smallest $\ell_2$-norm^3 of the residuals, i.e., the least-squares solution. When
$N \ll M$, this solution is strongly affected by noise and is practically useless in general.
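As an illustration of why the plain least-squares route fails when $N \ll M$, the following minimal sketch uses a random toy sensing matrix and a random toy signal (both are assumptions made for the example, not the SDSS filters or spectra). The pseudo-inverse reproduces the three measurements essentially exactly, yet the recovered signal bears little resemblance to the true one.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 200, 3                       # toy signal length and number of measurements

f_true = rng.normal(size=M)         # generic, non-sparse toy signal
Phi = rng.normal(size=(N, M))       # toy sensing matrix (not the SDSS filters)
y = Phi @ f_true                    # Eq. (1) with e = 0

# Minimum-norm least-squares solution via the SVD-based pseudo-inverse
f_ls = np.linalg.pinv(Phi) @ y

residual = np.linalg.norm(y - Phi @ f_ls)
rel_error = np.linalg.norm(f_ls - f_true) / np.linalg.norm(f_true)
print(f"residual of the measurements: {residual:.1e}")
print(f"relative error of the recovered signal: {rel_error:.2f}")
```

The measurements alone constrain only an $N$-dimensional subspace of the signal, which is why additional structure, such as sparsity, is needed.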



However, this problem can be overcome if the ingredient of sparsity is invoked. The
success of CS techniques is fundamentally based on the idea that, if the signal of interest
is sparse in a certain basis set (or can be efficiently compressed in this basis set), the
reconstruction becomes possible. Any compressible signal^4 can be written, in general, in the
following way:

$$f = W^T \alpha, \qquad (2)$$

where now $\alpha$ is a $K$-sparse^5 vector of size $M$ and $W^T$ is the transpose of an $M \times M$
transformation matrix associated with the basis set in which the signal is sparse. For instance, $W$
can be the Fourier matrix if the signal $f$ is the combination of a few sinusoidal components.
In our case, we will use the transformation matrix associated with the principal components.

The combination of the previous two ingredients leads to the following multiplexing
scheme:

$$y = \Phi W^T \alpha + e, \qquad (3)$$

with the hypothesis that $\alpha$ is sparse, i.e., that the $\ell_0$-norm of $\alpha$ is as small as possible.

^3 The $\ell_n$-norm of a vector is given by $\|x\|_n = (\sum_i |x_i|^n)^{1/n}$ if $n > 0$. The $\ell_0$
pseudo-norm is given by the number of non-zero elements of $x$.

^4 A signal is said to be compressible (or quasi-sparse) if it is possible to find a basis for which the projection
coefficients along the vectors of the basis, reordered in decreasing magnitude, decay in absolute value like a
power law.

^5 A vector is $K$-sparse if only $K$ elements of the vector are different from zero.



2.1. Sparsity 



Principal component analysis (PCA; Pearson 1901; Karhunen 1947; Loève 1955) has
been applied to SDSS data with the aim of classification, noise reduction and compression
(e.g., Connolly et al. 1995; Yip et al. 2004) for different types of objects. The principal
components (eigenvectors of the correlation/covariance matrix of the database) represent a
complete basis set for a given database of spectra.

One of the advantages of the PCA decomposition is that the importance of an
eigenvector (measured as the absolute value of the associated eigenvalue) typically decays like a
power law. Therefore, if one considers that only $K$ principal components contribute
significantly to the reconstruction of a spectrum, the vector $\alpha$ in Eq. (2) is
non-zero only in the first $K$ elements, and approximately zero in the rest. Additionally, the
matrix $W^T$ is built with the principal components as columns, ordered by decreasing absolute
value of their associated eigenvalues.
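The truncation described above can be sketched with a toy database. Everything in this example is an assumption made for illustration (the sinusoidal modes, their amplitudes, the noise level and the sizes); it does not use the McGurk et al. (2010) components. The principal components are obtained from the SVD of the mean-subtracted data, and a spectrum is rebuilt from its first $K$ coefficients only.

```python
import numpy as np

rng = np.random.default_rng(1)
n_spectra, M, K = 300, 120, 3

# Toy database: a constant level plus K smooth modes with decreasing amplitudes
x = np.linspace(0.0, 1.0, M)
modes = np.stack([np.sin((k + 1) * np.pi * x) for k in range(K)])      # (K, M)
amps = rng.normal(size=(n_spectra, K)) * np.array([1.0, 0.3, 0.1])
spectra = 10.0 + amps @ modes + 0.01 * rng.normal(size=(n_spectra, M))

mean_spec = spectra.mean(axis=0)
# Rows of Vt are the principal components, ordered by decreasing singular value
_, s, Vt = np.linalg.svd(spectra - mean_spec, full_matrices=False)
eigvals = s**2 / (n_spectra - 1)          # eigenvalues of the covariance matrix

# Project one spectrum on every component, then keep only the first K coefficients
alpha = Vt @ (spectra[0] - mean_spec)
recon = mean_spec + alpha[:K] @ Vt[:K]

explained = eigvals[:K].sum() / eigvals.sum()
rel_err = np.abs(recon - spectra[0]) / np.abs(spectra[0])
print(f"variance captured by {K} components: {explained:.4f}")
print(f"largest relative error of the truncated spectrum: {rel_err.max():.4f}")
```

In this toy case the coefficients beyond the first $K$ carry only noise, which is the situation in which setting them to zero is harmless.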



Recently, McGurk et al. (2010) have applied PCA to SDSS stellar spectra. They have
analyzed a subset of the full spectral database of SDSS and calculated the principal
components separately for stars in intervals of 0.02 mag in the g - r color. The range of colors
considered spans -0.2 < g - r < 0.9, corresponding to MK spectral types A3 to K3.
According to Ivezić et al. (2005), this segregation in g - r color is roughly equivalent to a segregation
in effective temperature due to the large correlation between this parameter and the g - r
color. It is also of interest to point out that the effect of reddening is limited by selecting
stars with an estimated extinction below 0.3 mag in the r band. Thanks to the binning,
the number of principal components needed in each interval to reach the noise level is greatly
reduced. They demonstrate that the mean spectrum plus three principal components
(hereafter referred to as the first four principal components) are more than enough to statistically
reconstruct the stellar spectra at the noise level.



As a caveat, note that the quality of the principal component decomposition of McGurk et al.
(2010) is only measured through the median difference. It is then expected that ~50% of the
stars in each bin have a decomposition that reproduces the spectra with a difference larger
than the noise level (see §2.3). If a different binning is proposed in the future leading to new
(hopefully improved) principal components, our reconstruction scheme remains unchanged
and can be computed using exactly the same observations. Thankfully, the segregation of
Ivezić et al. (2005) is done using an observed quantity, so the bin can be identified using only
photometric data. We adopt this highly efficient PCA decomposition for reconstructing SDSS
stellar spectra from photometric measurements.



2.2. Sensing matrix 

In addition to the sparsity condition, the other important ingredient of our technique
is the choice of the sensing matrix. This matrix is the one that relates the underlying
spectrum to the measurements we use in the reconstruction. Our aim is to test whether
photometric data can be used to reconstruct spectra, so the sensing matrix is not a
choice, but is given by the weighting functions of the SDSS filter set. Figure 1 shows the
ugriz filter set and an example of an observed spectrum. Note that filters u and z have
important contributions outside the observed spectral range. Consequently, the information
they contain cannot be easily utilized under the scheme presented in this paper and we carry
out the reconstructions using only filters g, r and i.

Obtaining the magnitude in the filter $k$ of the SDSS system from spectroscopic data
reduces to the calculation of the following quantity (Fukugita et al. 1996):

$$m_k = -2.5 \log_{10} \frac{\int f(\nu) S_k(\nu)\, d\ln\nu}{\int S_k(\nu)\, d\ln\nu} - 48.6, \qquad (4)$$

where $f(\nu)$ is the flux distribution of the star and $S_k(\nu)$ is the effective
transmissivity of the filter, including the filter response, the CCD quantum efficiency and the typical
transmission of the sky for a point source (in our case adopted for an airmass of 1.3)^6. The
flux measured for each filter can be estimated using a Riemann sum as (note that more
precise quadrature rules can be used without a significant change in the following discussion):

$$\int f(\nu) S_k(\nu)\, d\ln\nu \approx \sum_{i=1}^{M} f(\lambda_i) S_k(\nu_i) \frac{\Delta\lambda_i}{\lambda_i}, \qquad (5)$$

where we have carried out the integration along the wavelength axis and used $f(\lambda)$ instead of $f(\nu)$
since the principal components of McGurk et al. (2010) are given in terms of $f(\lambda)$. The
normalization constant for each filter, $N_k$ (not to be confused with the number of
measurements $N$), is obtained likewise:

$$N_k = \int S_k(\nu)\, d\ln\nu \approx \sum_{i=1}^{M} S_k(\nu_i) \frac{\Delta\lambda_i}{\lambda_i}. \qquad (6)$$

^6 http://www.sdss.org/dr7/instruments/imager/filters/index.html

Note that one is able to isolate the flux from the observed magnitudes as:

$$10^{-0.4(m_k + 48.6)} \approx \frac{1}{N_k} \sum_{i=1}^{M} f(\lambda_i) S_k(\nu_i) \frac{\Delta\lambda_i}{\lambda_i}. \qquad (7)$$

Consequently, the flux associated with the measured photometric quantity can be written as
the dot product of the original flux distribution $f(\lambda_i)$ and a weighting function, so that the
row of the sensing matrix $\Phi$ in Eq. (1) associated with filter $j$ is given by:

$$\Phi_{ji} = \frac{1}{N_j} S_j(\nu_i) \frac{\Delta\lambda_i}{\lambda_i}. \qquad (8)$$
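The construction of the sensing matrix and of the synthetic magnitudes can be sketched as follows. The Gaussian transmission curves, their centers and widths, and the wavelength grid are placeholders assumed for this example; a real application would use the tabulated SDSS g, r and i response curves. The function names `sensing_matrix` and `ab_magnitudes` are hypothetical.

```python
import numpy as np

def sensing_matrix(wave, filters):
    """Rows of Phi as in Eq. (8): S_j(lambda_i) * dlambda_i / lambda_i / N_j."""
    dlam = np.gradient(wave)                        # Delta lambda_i
    weights = filters * dlam / wave                 # shape (N, M)
    norm = weights.sum(axis=1, keepdims=True)       # N_j, the Riemann sum of Eq. (6)
    return weights / norm

def ab_magnitudes(Phi, flux):
    """Magnitudes of Eq. (4) from the filter-weighted fluxes Phi @ flux."""
    return -2.5 * np.log10(Phi @ flux) - 48.6

wave = np.linspace(3800.0, 9200.0, 500)             # Angstrom, toy grid
centers = np.array([4700.0, 6200.0, 7500.0])        # illustrative band centers only
filters = np.exp(-0.5 * ((wave - centers[:, None]) / 400.0) ** 2)

Phi = sensing_matrix(wave, filters)

# A flat spectrum of 3.631e-20 erg/s/cm^2/Hz is the AB zero point (magnitude ~0)
flux = np.full_like(wave, 3.631e-20)
mags = ab_magnitudes(Phi, flux)

# Eq. (7): the magnitudes can be turned back into the measured fluxes
flux_from_mags = 10.0 ** (-0.4 * (mags + 48.6))
print(mags)
```

Because each row of `Phi` sums to one, a flat spectrum returns the same flux in every band, which is a convenient sanity check on the normalization.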



2.3. Reconstruction 

The sparsity constraint of the spectrum is fulfilled automatically when using a principal
component decomposition, with the additional advantage of knowing exactly which
coefficients of the sparse vector $\alpha$ are non-zero. Therefore, the solution of the problem given by
Eq. (3) is simpler than the full CS problem, in which the non-zero elements of $\alpha$ have to be
identified. Consequently, the solution to Eq. (3) is given by the sparse vector that minimizes
the following $\ell_2$-norm:

$$\| y - \Phi W^T \alpha \|_2. \qquad (9)$$

In other words, we look for the sparse vector $\alpha$, with the first $K$ elements different from zero
and the rest set to zero, that minimizes the squared difference between the photometric fluxes
in the g, r and i filters and the ones reconstructed using the previous formalism, where the
flux is obtained as a linear combination of $K$ principal components. We now develop in more
detail the steps to be followed.

Assume that the signal of interest $F(\lambda)$ can be written as a linear combination of $K$
(sparsity) PCA basis functions $B_k(\lambda)$, so that:

$$F(\lambda_i) = \sum_{k=1}^{K} \alpha_k B_k(\lambda_i) \quad \forall i = 1,\ldots,M. \qquad (10)$$

The measurement process produces the following linear combinations:

$$y_j = \sum_{i=1}^{M} \Phi_{ji} F(\lambda_i) \quad \forall j = 1,\ldots,N, \qquad (11)$$

where $M$ is the number of wavelength points, $N$ is the number of measurements and the $\Phi_{ji}$
are the matrix elements of the sensing matrix given in Eq. (8). Plugging Eq. (10) into
Eq. (11), we end up with:

$$y_j = \sum_{i=1}^{M} \Phi_{ji} \sum_{k=1}^{K} \alpha_k B_k(\lambda_i) = \sum_{k=1}^{K} \alpha_k \sum_{i=1}^{M} \Phi_{ji} B_k(\lambda_i), \qquad (12)$$

which, making the substitution $t_{jk} = \sum_{i=1}^{M} \Phi_{ji} B_k(\lambda_i)$, can be written as:

$$y_j = \sum_{k=1}^{K} \alpha_k t_{jk}. \qquad (13)$$

The minimization of Eq. (9) can be easily done by calculating the derivatives with respect to
each $\alpha_k$ and equating them to zero. In other words, given the vector $o$ of length $N$ with the
observations (photometry), we define the metric function:

$$\chi^2 = \sum_{j=1}^{N} \frac{1}{\sigma_j^2} \left[ o_j - \sum_{k=1}^{K} \alpha_k t_{jk} \right]^2, \qquad (14)$$

where $\sigma_j^2$ is the variance associated with $o_j$, discussed in Section 4. Then, we end up with the
following set of linear equations for $\alpha_k$:

$$\sum_{k=1}^{K} \alpha_k \sum_{j=1}^{N} \frac{t_{jk} t_{jl}}{\sigma_j^2} = \sum_{j=1}^{N} \frac{o_j t_{jl}}{\sigma_j^2}, \quad l = 1,\ldots,K. \qquad (15)$$

In matrix form, we have:

$$G \alpha = b, \qquad (16)$$

where

$$G_{kl} = \sum_{j=1}^{N} \frac{t_{jk} t_{jl}}{\sigma_j^2}, \qquad b_l = \sum_{j=1}^{N} \frac{o_j t_{jl}}{\sigma_j^2}. \qquad (17)$$

The t's are defined from the principal components of the spectra and the photometric system;
they are common to all objects. The only additional information needed for each object is
the photometry and its expected uncertainties. Each spectrum is reconstructed by computing
the $b$ vector, calculating $G^{-1} b$, and using these numbers in the linear combination of Eq. (10).
It is also of interest to note that the solution to Eq. (9) can alternatively be obtained
using the SVD of the matrix $\Phi W^T$. We have verified empirically that this matrix of
size $N \times M$ fulfills rank($\Phi W^T$) $= N$. Thus, in order to reconstruct the spectrum using
$K$ principal components, we need to have at least $K$ measurements, i.e., $K \le N$. Since we have only
the g, r and i magnitudes available, we cannot expect to reconstruct the spectrum using the
four principal components tabulated by McGurk et al. (2010) from only 3 measurements.
As a consequence, we have to limit the reconstruction to the average spectrum plus two
principal components, leading to slightly larger errors.

Summarizing, from the knowledge of the system response at each wavelength, the
principal components associated with the g - r bin of the star and the g, r, and i magnitudes, one
is able to reconstruct the stellar spectrum by solving the linear system of Eq. (16) and using
the coefficients in the linear expansion given by Eq. (10).
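The whole procedure of this section can be condensed into a few lines. The sketch below is a toy illustration under stated assumptions: Gaussian stand-ins for the three filters, a polynomial basis in place of the mean spectrum and two principal components, and noiseless synthetic photometry. The function name `reconstruct` is hypothetical.

```python
import numpy as np

def reconstruct(Phi, basis, obs, sigma):
    """Solve G alpha = b (Eqs. 16-17) and evaluate the expansion of Eq. (10).

    Phi   : (N, M) sensing matrix
    basis : (K, M) mean spectrum and principal components, with K <= N
    obs   : (N,) photometric fluxes o_j
    sigma : (N,) standard deviations of o_j
    """
    t = Phi @ basis.T                     # t[j, k] = sum_i Phi[j, i] * B_k(lambda_i)
    w = 1.0 / sigma**2
    G = (t * w[:, None]).T @ t            # G[k, l] = sum_j t[j, k] t[j, l] / sigma_j^2
    b = (t * w[:, None]).T @ obs          # b[l]    = sum_j o_j t[j, l] / sigma_j^2
    alpha = np.linalg.solve(G, b)
    return alpha, alpha @ basis

# Toy setup: three Gaussian "filters" and a three-element smooth basis
wave = np.linspace(4000.0, 9000.0, 400)
x = (wave - wave.mean()) / (wave.max() - wave.min())
centers = np.array([4700.0, 6200.0, 7500.0])
Phi = np.exp(-0.5 * ((wave - centers[:, None]) / 400.0) ** 2)
Phi /= Phi.sum(axis=1, keepdims=True)
basis = np.stack([np.ones_like(x), x, x**2])

alpha_true = np.array([1.0, 0.3, -0.2])
spectrum = alpha_true @ basis
obs = Phi @ spectrum                      # noiseless synthetic photometry
sigma = 0.01 * obs

alpha, recon = reconstruct(Phi, basis, obs, sigma)
print(alpha)
```

With $K = N = 3$ and noiseless data the coefficients are recovered exactly (up to round-off); with real spectra the result is instead the best photometric estimate of the projection onto the retained components.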



3. Demonstration of the technique 

We carry out reconstructions using magnitudes synthesized from observed spectra
included in SDSS Data Release 7. The synthetic magnitudes are obtained following Eq. (4) for
filters g, r and i. These could be taken directly from the photometric observations instead.
The sensing matrix is built using Eq. (8). The spectra of four representative stars from the
sample are reconstructed by solving the linear system of Eq. (16). The success of the technique
is shown in Fig. 2. As stated before, to this aim we make use only of the mean spectrum
and two principal components. The original noisy spectrum is shown in black. The
projection of the observed spectrum on the space spanned by the first three principal
components is shown in red. This constitutes the best possible reconstruction that we
could achieve with the presented method. The spectrum reconstructed with our technique
using the magnitudes in the filters g, r, and i is shown in blue. Note that the reconstruction
closely follows the red curve, indicating that a good reconstruction is possible and that the
projection along the first three principal components can be obtained reliably from the linear
measurements made with the SDSS filters.

The fundamental characteristics of the spectra are reproduced with precision, making
it possible to empirically infer spectroscopic quantities using photometric measurements. At
the same time, thanks to the projection along the principal components, the spectrum
reconstructed from gri magnitudes is automatically denoised (see, e.g., Martínez González et al.
2008, for the denoising capabilities of PCA). Particularly large residuals are visible at specific
wavelengths in Fig. 2. In panel a) we can spot some issues with sky removal at the green [O I]
line (5577 Å) and at the IR end. In panels b), c) and d) there are strong residuals around the
transitions of the Balmer series and other strong features. These residuals change sign very
quickly around the central wavelength, signaling a horizontal offset between the original and
the reconstructed spectra, most likely related to Doppler velocity shifts, which have not
been corrected but happen to be small enough for the star depicted in panel a).

As four examples are not statistically relevant, we have carried out an analysis of the
differences between the reconstructed and original spectra for 500 stars chosen at random
from the seventh data release of the SDSS (Abazajian et al. 2009). We have verified that a
sample of this size chosen at random covers all g - r bins and provides reliable statistics.
The quality is characterized, at each wavelength, by the 5th, 50th and 95th percentiles of
the distribution of relative errors between the reconstructed spectrum and the original one.
The results are shown in the upper left panel of Fig. 3. The 50th percentile
(the median) is represented as a black curve and indicates that half of the stars can be
reconstructed with relative errors smaller than 2% (from ~4200 to 9000 Å). The 5th and
95th percentiles (blue and red curves) are also shown in the upper left panel of Fig. 3.
The reconstructions can be done with relative errors well below 10%
for 95% of the stars. Of course, this does not rule out the presence of 5% of the stars with
relative reconstruction errors potentially larger than 10%.
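The percentile curves described above are straightforward to compute once the original and reconstructed spectra are stacked in arrays. The sketch below assumes that the relative error is taken in absolute value, which is our reading of the text rather than something it states explicitly, and uses synthetic data with a 2% Gaussian scatter instead of SDSS spectra. The function name `error_percentiles` is hypothetical.

```python
import numpy as np

def error_percentiles(original, reconstructed, q=(5, 50, 95)):
    """Percentiles over stars (axis 0) of the absolute relative error per wavelength."""
    rel = np.abs(reconstructed - original) / np.abs(original)
    return np.percentile(rel, q, axis=0)          # shape (len(q), M)

rng = np.random.default_rng(2)
n_stars, M = 500, 50
original = 1.0 + rng.random(size=(n_stars, M))    # positive toy fluxes
reconstructed = original * (1.0 + 0.02 * rng.normal(size=(n_stars, M)))

p5, p50, p95 = error_percentiles(original, reconstructed)
print(f"typical median relative error: {p50.mean():.4f}")
```

Each returned curve has one value per wavelength, so the three arrays can be plotted directly against the wavelength axis.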

For reference, in Fig. 3 reconstructions with the first three (lower left panel) and four
(lower right panel) principal components as obtained from McGurk et al. (2010) are
compared with the original spectra. These plots summarize the quality of the PCA reconstructions.
Although PCA reconstructions with relative errors below or of the order of 1% are possible
for 50% of the stars, a fraction of stellar spectra will incur (even knowing exactly the
projections along the principal components) relative errors larger than 10%. Note also that the
improvement in the reconstruction is marginal when using four instead of three principal
components. An indication of the quality of the PCA reconstruction is that the difference
between the initial magnitudes and the reconstructed magnitudes has a standard deviation
of 0.008 for g, r and i when using 4 principal components. When using only one principal
component, this number increases to 0.03 for g and r and to 0.05 for i.

It is important to realize that the residuals shown in Fig. 3 are significantly higher at
wavelengths with lines than in continuum regions. This suggests that the PCA reconstruction
reproduces the continuum shape well, but not the line strengths. However, lines are
crucial for recovering information on surface gravity and chemical composition. A possibility
for improvement is therefore to perform PCA on continuum-corrected spectra.

If, instead of using the original spectra as reference, reconstructions are compared with
the spectra projected onto the space spanned by the first three principal components, the
results are those shown in the upper right panel of Fig. 3. It is clear from this plot that our
method is able to reliably extract the projection along the first three principal components
from photometric information.

In order to analyze the quality of reconstructions as a function of the spectral type, we
show in Fig. 4 the relative error between the reconstruction and the original noisy spectrum
for stars in different bins of g - r. As expected, our reconstruction leads to
slightly worse results for cooler stars. This is a consequence of the fact that the spectrum
of cool stars is relatively more complex and the photometry is not able to capture its full
variation. In any case, even in the least favorable case, for stars with g - r > 0.6 ($T_{\rm eff} \lesssim 5000$
K), reconstruction errors are below 10% for 95% of the stars over a large wavelength range.



4. Influence of errors 

Observed magnitudes are always accompanied by an error bar. It is important
to quantify the effect of this error on the reconstruction of the spectrum. Assuming Gaussian
errors, an error bar of standard deviation $\sigma_{m_j}$ in magnitudes at filter $j$ translates into an
error bar in the flux at the same filter of:

$$\sigma_{o_j} = 0.4 \ln 10 \; o_j \sigma_{m_j} \approx 0.92\, o_j \sigma_{m_j}. \qquad (18)$$

Even if we assume that the error bars of the observations are not correlated, the resulting
error bars for the projections along the mean spectrum and the principal components are
correlated. Assuming that the matrix $G$ is noise-free^7, error propagation in the solution of
the linear system of Eq. (16) leads to the following formula for the covariance matrix of the
projection along the principal components:

$$C_\alpha = G^{-1} C_b (G^{-1})^T, \qquad (19)$$

where

$$C_b = T C_o T^T, \qquad (20)$$

with $T_{kj} = t_{jk}/\sigma_j^2$ and $\sigma_j \equiv \sigma_{o_j}$. For simplicity, we assume that the covariance matrix of the observed
fluxes, $C_o$, is diagonal, with diagonal elements computed from Eq. (18). Finally, the
covariance matrix for the reconstructed spectrum is given by:

$$C_f = W^T C_\alpha W. \qquad (21)$$
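The chain of Eqs. (18)-(21) can be checked numerically with a small sketch. The matrix `t`, the fluxes, the basis and the 0.03 mag uncertainty are toy values assumed for illustration, and the function name `propagate_errors` is hypothetical. The rows of `basis` play the role of the retained columns of $W^T$.

```python
import numpy as np

def propagate_errors(t, obs, sigma_mag, basis):
    """Covariances of the coefficients and of the spectrum, Eqs. (18)-(21).

    t         : (N, K) matrix with elements t[j, k]
    obs       : (N,) photometric fluxes o_j
    sigma_mag : magnitude uncertainty (scalar or (N,) array)
    basis     : (K, M) retained basis vectors
    """
    sigma_o = 0.4 * np.log(10.0) * obs * sigma_mag     # Eq. (18)
    w = 1.0 / sigma_o**2
    T = (t * w[:, None]).T                             # T[k, j] = t[j, k] / sigma_j^2
    G = T @ t                                          # Eq. (17)
    C_o = np.diag(sigma_o**2)                          # uncorrelated flux errors
    C_b = T @ C_o @ T.T                                # Eq. (20)
    G_inv = np.linalg.inv(G)
    C_alpha = G_inv @ C_b @ G_inv.T                    # Eq. (19)
    C_f = basis.T @ C_alpha @ basis                    # Eq. (21)
    return C_alpha, C_f

rng = np.random.default_rng(3)
t = np.array([[1.0, 0.2, 0.1],
              [1.0, 0.0, -0.1],
              [1.0, -0.3, 0.2]])                       # toy values, N = K = 3
obs = np.array([1.0, 2.0, 1.5])
basis = rng.normal(size=(3, 50))

C_alpha, C_f = propagate_errors(t, obs, 0.03, basis)
print(np.sqrt(np.diag(C_alpha)))                       # 1-sigma errors of the coefficients
```

For a diagonal $C_o$ built from the same $\sigma_j$ that enter $G$, the product $T C_o T^T$ equals $G$, so $C_\alpha$ reduces to $G^{-1}$; this identity provides a quick consistency check of an implementation.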



^7 This implies assuming that the filter and atmospheric transmissions are known with absolute certainty.
This assumption can be relaxed without too much effort, although the final expression for the covariance matrix
of the projection along the principal components then contains another contribution due to the uncertainties in
the $G$ matrix (see Asensio Ramos & Collados 2008, for an example in another field).

It is difficult to characterize the sensitivity to errors in the observed magnitudes because
of the large variability in the stellar fluxes. For presentation purposes and to give a rough
estimate, let us assume that all Sloan magnitudes have $\sigma_m = 0.03$ mag, which is
representative of more than 95% of the observed stars. Likewise, let us pick a representative value
for the flux at each filter, $o_j$, as the average in each bin. Following the previous expressions,
we show in Figure 5 the standard deviation of the error in the projection along the principal
components (the square root of the diagonal elements cov($\alpha_i$, $\alpha_i$)) normalized to the product of
the observational error in the magnitude and the mean projection along the mean spectrum,
estimated for the average flux in each bin. The results indicate that the relative
error in the projection along the principal components is roughly similar to $\sigma_m$. In the SDSS
database, typical errors range from 0.01 to 0.05 mag, with more than 95% of the stars having
errors below 0.03 mag. Therefore, relative errors of 1-3% are induced in the reconstruction
due to the presence of uncertainties in the observed magnitudes.



The principal components of McGurk et al. (2010) have been computed by shifting all
spectra to a common wavelength axis at zero radial velocity. Therefore, the reconstructions we
carry out give as an output the spectrum at zero radial velocity and contain no information on
radial velocities. However, the fluxes measured photometrically with the gri filters contain
the influence of the radial velocity. This effect cannot be compensated for, and it is not
obvious a priori whether it has an influence on the final reconstruction. From SDSS data,
the distribution of radial velocities induces Doppler shifts that are, with ~95% probability,
smaller than 6 pixels (~414 km/s). According to the reconstructions shown in Figs. 2 and 3,
which were performed with the original spectra, Doppler shifts increase errors in the lines,
as suggested by the antisymmetric residuals in the strongest lines of panels b), c) and d) in
Fig. 2. Such errors are masked in Fig. 3 by the symmetrization induced by averaging
over many stars. Therefore, although radial velocities cannot be estimated with our method,
their effect on the quality of the reconstruction is not very important, except for spectral lines.
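The correspondence between 6 pixels and ~414 km/s quoted above can be reproduced with a short calculation. The sketch assumes that the spectra are sampled with a constant step of $10^{-4}$ in $\log_{10}\lambda$ per pixel; this sampling is not stated in the text and is introduced here as an assumption, together with the non-relativistic Doppler formula.

```python
import numpy as np

C_KMS = 299792.458            # speed of light in km/s
DLOG10_LAMBDA = 1.0e-4        # assumed pixel step in log10(wavelength)

# A constant step in log-wavelength corresponds to a constant velocity step:
# v = c * d(ln lambda) = c * ln(10) * d(log10 lambda)
km_s_per_pixel = C_KMS * np.log(10.0) * DLOG10_LAMBDA
shift_6_pixels = 6 * km_s_per_pixel

print(f"{km_s_per_pixel:.1f} km/s per pixel, {shift_6_pixels:.0f} km/s for 6 pixels")
```

Under this assumption one pixel corresponds to about 69 km/s, consistent with the figure of ~414 km/s for 6 pixels.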



5. Conclusions 

We have presented a method to reconstruct stellar spectra from SDSS photometry that 
is fast, efficient and reliable. Although it might sound magical, this reconstruction is made 
possible thanks to the sparsity of stellar spectra, which can be expressed as a linear combina- 
tion of a few eigenspectra obtained from a principal component decomposition. Our method 
returns the projection of the observed spectrum onto the space spanned by the first three 
principal components just from the photometric g, r and i magnitudes. As a consequence, 
the resulting spectrum is simultaneously denoised and reconstructed. We have analyzed the
statistical properties of the reconstructed spectra and verified that the residuals are roughly
compatible with the noise present in the observations, although they are not random. We have
also analyzed the influence of observational errors in the magnitudes and the presence of
non-zero radial velocities on the reconstruction. Both of them produce very small effects on
the performance of our algorithm.



Recently, McGurk et al. (2010) have investigated the possible correlation between stellar
parameters and the projections of the spectrum along the principal components. Since
PCA is a linear technique and the stellar parameters are typically nonlinear combinations
of parts of the observed spectrum, a strong correlation is not to be expected, as shown by
McGurk et al. (2010). In our case, though, we are able to reconstruct the full spectrum from
photometric observables. This opens up the possibility of applying standard techniques for
inferring stellar parameters to the reconstructed spectrum.
Fig. 1. Sample observed spectrum of an F-type star with [Fe/H] = -1.6, $T_{\rm eff}$ = 5886 K and
log g = 4.61, plotted against wavelength in Å. The vertical axis is in flux units normalized to the
maximum in the observed spectral region. We also show the total efficiency including atmospheric
transmission for the SDSS filter set at 1.3 air masses for point-like objects. Note that filters u and z have
contributions outside the observed spectral region, so that they cannot be used for
reconstruction.

Fig. 2. Four examples of the spectrum reconstruction using simulated magnitudes in filters
g, r and i. The original spectrum is shown in black. The spectrum reconstructed from the
exact projections along the first 3 principal components of McGurk et al. (2010) is shown
in red. This constitutes the best reconstruction of the spectrum we can achieve with our
method. The spectrum reconstructed using our scheme is shown in blue, with the residual
indicated in the lower subpanel of each panel. Note the similarity between the red and blue
curves.

Fig. 3. Statistical comparison showing the viability of the reconstruction scheme for 5000
stars chosen at random from the database of McGurk et al. (2010). We show the 5th (blue
curve), 50th (black curve) and 95th (red curve) percentiles of the relative error distribution
for each wavelength. The upper left panel presents the relative error between
the spectrum reconstructed using our method and the original noisy spectrum. It is clear
that 50% of the stars have relative errors below 1-2%, while relative errors are smaller than
10% with 95% probability. The upper right panel shows the relative error between the
reconstruction using photometric data and the denoised spectrum computed with the exact
projections along the first three principal components. For reference, the lower panels show
the comparison between the original spectra and the ones reconstructed using the exact
projections along the first three (lower left panel) and the first four (lower right panel)
principal components.

Fig. 4. Relative error between the spectrum reconstructed using our method and the
original noisy spectrum for the sample separated in color bins. Note that reconstructions
of stars with higher effective temperatures (smaller g - r) are of better quality due to the
small number of spectral features. In contrast, cooler stars (larger g - r) tend to have
molecular bands that hinder the reconstruction. In any case, even in the least favourable
case, 50% of the stars have relative errors below 4-5%, while relative errors are smaller than
20% with 95% probability (with a large wavelength region with errors below 10% for 95% of
the stars). The colors have the same meaning as those of Fig. 3.

Fig. 5. Standard deviation of the error in the reconstruction of the projection of the
spectrum along the mean spectrum and the first two principal components for each bin. The
color g - r for each bin is also indicated in the upper axis. The error is normalized as described
in the text (by the product of the magnitude error $\sigma_m$ and the mean projection along the mean
spectrum) and estimated using the average flux in each filter for each bin.



